Climate metrics

Load necessary packages and define directories.

Spatial metrics

Load Gross Primary Productivity (GPP) MODIS (2000–2022) and ERA5 (1997–2020) climate metrics computed in Earth Engine and convert to long format. These metrics have been spatially aggregated to county boundaries as depicted by Figure 4 in (Kenduiywo, Ghosh, Hijmans, & Ndungu, 2020).

Format the date variable by removing unnecessary characters.

County date gpp
Baringo 2000-02-26 130.1147
Bomet 2000-02-26 364.8841
County date temp
Baringo 1997-01-01 297.4850
Bomet 1997-01-01 292.9671
County date precip
Baringo 1997-01-01 0.0001701
Bomet 1997-01-01 0.0008220

Spatial-temporal metrics

Temporally aggregate the spatial metrics per year to obtain spatial-temporal climate metrics. Event of conflicts from violence against civilians will also be aggregated per year and county.

gpp$date <- format(as.Date(gpp$date, format="%d/%m/%Y"),"%Y")
gpp <- aggregate(.~County+date, data=gpp, mean, na.rm=T)
tmp$date <- format(as.Date(tmp$date, format="%d/%m/%Y"),"%Y")
tmp <- aggregate(.~County+date, data=tmp, mean, na.rm=T)
prec$date <- format(as.Date(prec$date, format="%d/%m/%Y"),"%Y")
prec <- aggregate(.~County+date, data=prec, mean, na.rm=T)

con <- readRDS("civilianViolence.rds")
con <- subset(con@data, select= c(ADMIN1, FATALITIES, YEAR))
names(con)[1] <- "County"
names(con)[3] <- "date"
con <- aggregate(.~County+date, data=con, sum, na.rm=T)

Let’s make some plots and evaluate trends. First merge each jmetric with conflict data. To do so check for county naming consistencies.

con$County <- toupper(con$County)
gpp$County <- toupper(gpp$County)
tmp$County <- toupper(tmp$County)
prec$County <- toupper(prec$County)
c <- sort(unique(con$County))
c[!c %in% sort(unique(gpp$County))]
## [1] "ELGEYO MARAKWET" "MURANGA"
c[!c %in% sort(unique(tmp$County))]
## [1] "ELGEYO MARAKWET" "MURANGA"
c[!c %in% sort(unique(prec$County))]
## [1] "ELGEYO MARAKWET" "MURANGA"

Edit the inconsistencies on the identified counties.

con$County[con$County == "MURANGA"] <- "MURANG'A"
con$County[con$County == "ELGEYO MARAKWET" ] <- "ELGEYO-MARAKWET"

We can also compute anomalies for these climate metrics. The Z-score is a good measure to capture anomalies. Z-scores indicate the deviations of seasonal/annual metrics from its long-term annual mean. The unit is “standard deviations”. This a z-score of -1 means that the value is 1 standard deviation below the (expected) mean value i.e.;

zscore <- function(y){
  (y - mean(y, na.rm=TRUE) ) / (sd(y, na.rm=TRUE))
}

Compute the zScored metrics per year per Country

gpp <- ungroup(mutate(group_by(gpp,County), zgpp=zscore(gpp)))
tmp <- ungroup(mutate(group_by(tmp,County), ztemp=zscore(temp)))
prec <- ungroup(mutate(group_by(prec,County), zprecip=zscore(precip)))
con <- ungroup(mutate(group_by(con,County), zFATALITIES=zscore(FATALITIES)))

Vulnerable areas

We can conduct spatial regression to evaluate vulnerable areas. For this Geographically weighted regression (GWR) is adopted. GWR is an exploratory technique mainly intended to indicate where non-stationarity is taking place on the map, that is where locally weighted regression coefficients move away from their global values. Its basis is the concern that the fitted coefficient values of a global model, fitted to all the data, may not represent detailed local variations in the data adequately - in this it follows other local regression implementations. It differs, however, in not looking for local variation in ‘data’ space, but by moving a weighted window over the data, estimating one set of coefficient values at every chosen ‘fit’ point. The fit points are very often the points at which observations were made, but do not have to be. If the local coefficients vary in space, it can be taken as an indication of non-stationarity.

library(raster)
## 
## Attaching package: 'raster'
## The following object is masked from 'package:dplyr':
## 
##     select
ken <- getData("GADM", country="KEN", level=1)
## Warning in getData("GADM", country = "KEN", level = 1): getData will be removed in a future version of raster
## . You can use access these data, and more, with functions from the geodata package instead
ken$NAME_1 <- toupper(ken$NAME_1)
names(ken)[4] <- "County"
ken <- ken[,"County"]

Merge County boundaries with data.frame with Climate metrics and conflict information.

#ken <- merge(ken, dff, by="County",  duplicateGeoms = T)
df <- aggregate(.~County, data = dff[,-2], mean, na.rm=T)
ken <- merge(ken, df, by="County")

sp.na.omit <- function(x, margin=1) {
  if (!inherits(x, "SpatialPointsDataFrame") & !inherits(x, "SpatialPolygonsDataFrame")) 
    stop("MUST BE sp SpatialPointsDataFrame OR SpatialPolygonsDataFrame CLASS OBJECT") 
  na.index <- unique(as.data.frame(which(is.na(x@data),arr.ind=TRUE))[,margin])
    if(margin == 1) {  
      cat("DELETING ROWS: ", na.index, "\n") 
        return( x[-na.index,]  ) 
    }
    if(margin == 2) {  
      cat("DELETING COLUMNS: ", na.index, "\n") 
        return( x[,-na.index]  ) 
    }
 }
#shapefile(ken, "ken_conflicts.shp", overwrite=TRUE)
ken <- sp.na.omit(ken)
## DELETING ROWS:  41

GWR

Prior to running the GWR model we need to calculate a kernel bandwidth. This will determine now the GWR subsets the data when its test multiple models across space.

if (!"spgwr" %in% installed.packages()){
  install.packages("spgwr", dependencies = T)
}
library(spgwr)
GWRbandwidth <- gwr.sel(FATALITIES ~ precip, data=ken, adapt=T)
## Adaptive q: 0.381966 CV score: 1922.11 
## Adaptive q: 0.618034 CV score: 1999.398 
## Adaptive q: 0.236068 CV score: 1872.682 
## Adaptive q: 0.145898 CV score: 1857.621 
## Adaptive q: 0.07619138 CV score: 1855.048 
## Adaptive q: 0.08837297 CV score: 1861.617 
## Adaptive q: 0.04708886 CV score: 1848.961 
## Adaptive q: 0.02910252 CV score: 1804.984 
## Adaptive q: 0.01798634 CV score: 1821.511 
## Adaptive q: 0.02904693 CV score: 1804.821 
## Adaptive q: 0.02541111 CV score: 1795.552 
## Adaptive q: 0.0225751 CV score: 1790.406 
## Adaptive q: 0.02082235 CV score: 1790.541 
## Adaptive q: 0.02179189 CV score: 1789.297 
## Adaptive q: 0.02172365 CV score: 1789.227 
## Adaptive q: 0.02137939 CV score: 1789.417 
## Adaptive q: 0.02162399 CV score: 1789.246 
## Adaptive q: 0.02168296 CV score: 1789.232 
## Adaptive q: 0.02172365 CV score: 1789.227

Run GWR model and view the results.

gwr.model = gwr(FATALITIES ~ precip, data=ken, adapt=GWRbandwidth, hatmatrix=TRUE, se.fit=TRUE) 
## Warning in proj4string(data): CRS object has comment, which is lost in output; in tests, see
## https://cran.r-project.org/web/packages/sp/vignettes/CRS_warnings.html
gwr.model
## Call:
## gwr(formula = FATALITIES ~ precip, data = ken, adapt = GWRbandwidth, 
##     hatmatrix = TRUE, se.fit = TRUE)
## Kernel function: gwr.Gauss 
## Adaptive quantile: 0.02172365 (about 0 of 46 data points)
## Summary of GWR coefficient estimates at data points:
##                     Min.     1st Qu.      Median     3rd Qu.        Max.
## X.Intercept.     -4.1388      3.4140      8.6906     12.1235     46.1066
## precip       -14490.4431  -1573.2499   -994.5646    -50.0824   3721.9219
##                 Global
## X.Intercept.    10.775
## precip       -1076.530
## Number of data points: 46 
## Effective number of parameters (residual: 2traceS - traceS'S): 27.21259 
## Effective degrees of freedom (residual: 2traceS - traceS'S): 18.78741 
## Sigma (residual: 2traceS - traceS'S): 5.666171 
## Effective number of parameters (model: traceS): 20.68544 
## Effective degrees of freedom (model: traceS): 25.31456 
## Sigma (model: traceS): 4.881327 
## Sigma (ML): 3.62113 
## AICc (GWR p. 61, eq 2.33; p. 96, eq. 4.21): 334.4981 
## AIC (GWR p. 96, eq. 4.22): 269.6121 
## Residual sum of squares: 603.1789 
## Quasi-global R2: 0.7337493

This model has a Quasi-global R-squared value of 0.73 . So 73% of the variance can be explained by the model. This indicates a high in-sample prediction accuracy.

We will then bind the outputs to our ken polygon so we can map them.

results <-as.data.frame(gwr.model$SDF)
names(results)
##  [1] "sum.w"               "X.Intercept."        "precip"             
##  [4] "X.Intercept._se"     "precip_se"           "gwr.e"              
##  [7] "pred"                "pred.se"             "localR2"            
## [10] "X.Intercept._se_EDF" "precip_se_EDF"       "pred.se.1"
gwr.map <- cbind(ken, as.matrix(results))

Map the local R2.

#qtm(gwr.map, fill = "localR2")
library(tmap)
tm_shape(gwr.map) + tm_fill("localR2", n = 5, style = "quantile")  + tm_layout(frame = FALSE, legend.text.size = 0.5, legend.title.size = 0.6)
## Variable(s) "localR2" contains positive and negative values, so midpoint is set to 0. Set midpoint = NA to show the full spectrum of the color palette.

Display fatalities and precip

map1=tm_shape(gwr.map) + tm_fill("FATALITIES")  + tm_layout(frame = FALSE, legend.text.size = 0.8, legend.title.size = 0.8)+
  tm_text("County", size = 0.6, remove.overlap = T)+
  tm_format("World")
map2=tm_shape(gwr.map) + tm_fill("precip")  + tm_layout(frame = FALSE, legend.text.size = 0.8, legend.title.size = 0.8)+
  tm_text("County", size = 0.6, remove.overlap = T)+
  tm_format("World")
map1
map2

It can be noted that Turkana, Garissa, and Mandera the most vulnerable counties to civilian violence. This can be linked to adverse climate effects as depicted by low precipitation. The model confidence in this areas is also moderate based on Local R2.

Kenduiywo, B. K., Ghosh, A., Hijmans, R., & Ndungu, L. (2020). MAIZE YIELD ESTIMATION IN KENYA USING MODIS. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, V-3-2020, 477–482. https://doi.org/10.5194/isprs-annals-V-3-2020-477-2020